89 research outputs found

    Leveraging Large Language Models and Weak Supervision for Social Media data annotation: an evaluation using COVID-19 self-reported vaccination tweets

    The COVID-19 pandemic has presented significant challenges to the healthcare industry and society as a whole. With the rapid development of COVID-19 vaccines, social media platforms have become a popular medium for discussions on vaccine-related topics. Identifying vaccine-related tweets and analyzing them can provide valuable insights for public health researchers and policymakers. However, manual annotation of a large number of tweets is time-consuming and expensive. In this study, we evaluate the usage of Large Language Models, in this case GPT-4 (March 23 version), and weak supervision, to identify COVID-19 vaccine-related tweets, with the purpose of comparing performance against human annotators. We leveraged a manually curated gold-standard dataset and used GPT-4 to provide labels without any additional fine-tuning or instructing, in a single-shot mode (no additional prompting).

    Solar Event Tracking with Deep Regression Networks: A Proof of Concept Evaluation

    With the advent of deep learning for computer vision tasks, accurately labeled data in large volumes has become vital for any application. The increasingly large amounts of solar image data generated by the Solar Dynamics Observatory (SDO) mission make this domain particularly interesting for the development and testing of deep learning systems. The currently available labeled solar data is generated by the specialized detection modules of the SDO mission's Feature Finding Team (FFT). The major drawback of these modules is that detection and labeling are performed at a cadence of every 4 to 12 hours, depending on the module. Since SDO image data products are created every 10 seconds, there is a considerable gap between labeled observations and the continuous data stream. To address this shortcoming, we trained a deep regression network to track the movement of two solar phenomena: Active Region and Coronal Hole events. To the best of our knowledge, this is the first attempt at solar event tracking using a deep learning approach. Since it is impossible to fully evaluate the performance of the suggested event tracks with the original data (only partial ground truth is available), we demonstrate the effectiveness of our approach with several metrics. With the purpose of generating continuously labeled solar image data, we present this feasibility analysis showing the great promise of deep regression networks for this task. Comment: 8 pages, 5 figures; submitted and accepted for publication at IEEE Big Data 2019 - SABID Workshop

    Pulse of the Pandemic: Iterative Topic Filtering for Clinical Information Extraction from Social Media

    The rapid evolution of the COVID-19 pandemic has underscored the need to quickly disseminate the latest clinical knowledge during a public-health emergency. One surprisingly effective platform for healthcare professionals (HCPs) to share knowledge and experiences from the front lines has been social media (for example, the "#medtwitter" community on Twitter). However, identifying clinically relevant content in social media without manual labeling is a challenge because of the sheer volume of irrelevant data. We present an unsupervised, iterative approach to mine clinically relevant information from social media data, which begins by heuristically filtering for HCP-authored texts and incorporates topic modeling and concept extraction with MetaMap. This approach identifies granular topics and tweets with high clinical relevance from a set of about 52 million COVID-19-related tweets from January to mid-June 2020. We also show that because the technique does not require manual labeling, it can be used to identify emerging topics on a week-to-week basis. Our method can aid in future public-health emergencies by facilitating knowledge transfer among healthcare workers in a rapidly changing information environment, and by providing an efficient and unsupervised way of highlighting potential areas for clinical research. Comment: 24 pages, 5 figures. To be published in the Journal of Biomedical Informatics

    A Large-Scale COVID-19 Twitter Chatter Dataset for Open Scientific Research-An International Collaboration

    Funding: This work was partially supported by the National Institute on Aging through Stanford University's Stanford Aging and Ethnogeriatrics Transdisciplinary Collaborative Center (SAGE) (award 3P30AG059307-02S1). The work on the collection of Russian tweets was performed by Elena Tutubalina and supported by the Russian Science Foundation (grant number 18-11-00284). As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets, growing daily, related to COVID-19 chatter generated from 1 January 2020 to 27 June 2021 at the time of writing. This dataset provides a freely available additional data source for researchers worldwide to conduct a wide and diverse range of research projects, such as epidemiological analyses, emotional and mental responses to social distancing measures, the identification of sources of misinformation, and stratified measurement of sentiment towards the pandemic in near real time, among many others.

    Preliminary results on the application of the aminoacid racemization technique in the Murcia Region (SE Iberian Peninsula) and their interest in paleoseismological research

    Geochronology is a critical issue in paleoseismological research. The amino acid racemization technique offers important advantages over more traditional dating methods: not only lower cost and faster turnaround, but also the relative abundance of the material analyzed, in this study terrestrial gastropods. Racemization results allow the relative ages of different sedimentary units to be compared from one trench to another. The technique can also be used as a geochronological tool, provided a calibration curve has first been obtained for the particular climate of the area and, ideally, for a particular genus. In this study we show the results obtained from the analysis of 40 samples of terrestrial gastropods from 7 different trenches located in the Murcia Region (SE Spain). Using the D/L ratio of aspartic acid, we show the coherence found between relative stratigraphic ages and the racemization age. Finally, we propose a provisional conversion equation between the racemization age, obtained from the algorithm of Torres et al. (1997) for gastropods of the central Iberian Peninsula, and the likely age of the samples.

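    Primeros resultados sobre la aplicación de la técnica de racemización de aminoácidos en la Región de Murcia (SE de la Península Ibérica) y su interés en estudios de paleosismología [Preliminary results on the application of the amino acid racemization technique in the Murcia Region (SE Iberian Peninsula) and their interest in paleoseismological research]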

    Geochronology is a critical issue in paleoseismological research. The amino acid racemization technique offers important advantages over more traditional dating methods: not only lower cost and faster turnaround, but also the relative abundance of the material analyzed, in this study terrestrial gastropods. Racemization results allow the relative ages of different sedimentary units to be compared from one trench to another. The technique can also be used as a geochronological tool, provided a calibration curve has first been obtained for the particular climate of the area and, ideally, for a particular genus. In this study we show the results obtained from the analysis of 40 samples of terrestrial gastropods from 7 different trenches located in the Murcia Region (SE Spain). Using the D/L ratio of aspartic acid, we show the coherence found between relative stratigraphic ages and the racemization age. Finally, we propose a provisional conversion equation between the racemization age, obtained from the algorithm of Torres et al. (1997) for gastropods of the central Iberian Peninsula, and the likely age of the samples.